DIVCLUS-T: A monothetic divisive hierarchical clustering method

Authors

  • Marie Chavent
  • Yves Lechevallier
  • Olivier Briant
Abstract

DIVCLUS-T is a divisive hierarchical clustering algorithm based on a monothetic bipartitional approach, which allows the dendrogram of the hierarchy to be read as a decision tree. It is designed for either numerical or categorical data. Like the Ward agglomerative hierarchical clustering algorithm and the k-means partitioning algorithm, it is based on the minimization of the inertia criterion. However, unlike Ward and k-means, it provides a simple and natural interpretation of the clusters. The price that DIVCLUS-T pays by construction, in terms of inertia, for this additional interpretation is studied by applying the three algorithms to six databases from the UCI Machine Learning repository.
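To illustrate what a monothetic bipartition based on inertia can look like, here is a minimal sketch for numerical data only. It is not the authors' implementation: the function names `inertia` and `best_monothetic_split`, the unit weights, the squared Euclidean distance, and the choice of candidate cuts at midpoints between consecutive distinct values are all assumptions made for this example. Each candidate split is a binary question on a single variable ("is variable j at most c?"), and the split with the smallest within-cluster inertia is kept.

```python
import numpy as np


def inertia(X):
    """Sum of squared Euclidean distances to the centroid (unit weights assumed)."""
    if len(X) == 0:
        return 0.0
    return float(((X - X.mean(axis=0)) ** 2).sum())


def best_monothetic_split(X):
    """Find the binary question 'X[:, j] <= c' with the lowest within-cluster inertia.

    Returns (variable index j, cut value c, within-cluster inertia),
    or None when no variable takes at least two distinct values.
    """
    best = None
    for j in range(X.shape[1]):
        values = np.unique(X[:, j])  # sorted distinct values of variable j
        # Assumed candidate cuts: midpoints between consecutive distinct values.
        for c in (values[:-1] + values[1:]) / 2:
            left = X[X[:, j] <= c]
            right = X[X[:, j] > c]
            within = inertia(left) + inertia(right)
            if best is None or within < best[2]:
                best = (j, float(c), within)
    return best


# Hypothetical toy data: two groups separated along the first variable.
X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])
print(best_monothetic_split(X))  # (0, 5.0, 1.0)
```

Applying such a split recursively to the resulting clusters would yield a hierarchy in which every node is described by one question on one variable, which is what makes the dendrogram readable as a decision tree.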

Related resources

Monothetic divisive clustering with geographical constraints

DIVCLUS-T is a descendant hierarchical clustering algorithm based on a monothetic bipartitional approach allowing the dendrogram of the hierarchy to be read as a decision tree. We propose in this paper a new version of this method called C-DIVCLUS-T which is able to take contiguity constraints into account. We apply C-DIVCLUS-T to hydrological areas described by agricultural and environmental v...

Empirical Comparison of a Monothetic Divisive Clustering Method with the Ward and the k-means Clustering Methods

DIVCLUS-T is a descendant hierarchical clustering method based on the same monothetic approach as segmentation, but from an unsupervised point of view. The dendrogram of the hierarchy is easy to interpret and can be read as a decision tree. We present DIVCLUS-T on a small numerical and a small categorical example. DIVCLUS-T is then compared with two polythetic clustering methods: the Ward ascen...

A monothetic clustering method

The proposed divisive clustering method simultaneously builds a hierarchy on a set of objects and a monothetic characterization of each cluster of the hierarchy. A division is performed according to the within-cluster inertia criterion, which is minimized among the bipartitions induced by a set of binary questions. In order to improve the clustering, the algorithm revises at each step the divi...

Choosing the Number of Clusters in Monothetic Clustering

Monothetic clustering is a divisive clustering method based on recursive bipartitions of the data set, determined by choosing splitting rules from any of the variables to conditionally optimally partition the multivariate responses. As in other clustering methods, the choice of the number of clusters is important in this method. Connections between monothetic clustering and decision trees moti...

Dissimilarity measures for histogram-valued data and divisive clustering of symbolic objects

Contemporary datasets are becoming increasingly large and complex, while techniques to analyse them are becoming more and more inadequate. Thus, new methods are needed to handle these new types of data. This study introduces methods to cluster histogram-valued data. However, histogram-valued data are difficult to handle computationally because observations typically have a different numbe...

Journal:
  • Computational Statistics & Data Analysis

Volume 52

Publication date: 2007